Muhammad Jaffri Mohd Nasir
2024-09-27
RCpp is not a popular package, but
still usefull for many reasons.
\(R\) language can be painfully slow when it comes to problems with large numbers of loop/iterations/recursions, accessing large elements in vectors/ matrices and applying mathematical operations. Accessing supercomputers to speed up \(R\) computational time can be impossible and costly. On the other hand, \(C\) language is known to be one of the fastest programming languages but might be overwhelming to code.
Introduced and maintained by Eddelbuettel et. al., the \(RCpp\) package provides easy integration and transition of objects between \(C\) and \(R\) languages through \(R\)’s application programming interface (API), thus mitigating bottlenecks and slowness in our code.
In this short talk, we will go through some basics of the \(RCpp\) that are absolutely necessary to help us mitigate bottlenecks in our code and introduce some of its useful extensions (e.g., RcppArmadillo). To motivate the audience, a few live applications are shown to demonstrate the speed of the \(RCpp\) package
library(Rcpp)
# Attaching package: 'Rcpp'
# The following object is masked from 'package:inline':
#registerPlugin
cppFunction(code =
'NumericVector fiboCpp(int n) {
NumericVector result(n);
if (n > 0) result[0] = 0;
if (n > 1) result[1] = 1;
for (int i = 2; i < n; i++) {
result[i] = result[i-1] + result[i-2];
}
return result;
}')
fiboCpp(100)*File named “fiboCpp.cpp” with the “.cpp” extension is exported.
\[F_{n} = F_{n-1} + F_{n-2}\] for \(n>1\), with conventions \(F_{0} = 0\) and \(F_{1} = 1\). It has several practical usage. Codes below are generated by Microsoft Copilot.
| R code | RCpp code (fiboCpp.cpp) |
|---|---|
Using R’s ‘rbenchmark’ package:
library(Rcpp)
cppFunction(code =
'List fiboCpp(int n) {
NumericVector result(n);
double resultFibo;
if (n > 0) result[0] = 0;
if (n > 1) result[1] = 1;
for (int i = 2; i < n; i++) {
result[i] = result[i-1] + result[i-2];
resultFibo = result[i];
}
return Rcpp::List::create(
Rcpp::Named("FiboSeq")=result,
Rcpp::Named("Fibo")=resultFibo);
}')
fiboCpp(100)Note that
Rcpp is designed to facilitate interfacing libraries written in C++ or C to R
RCppArmadillo enables C++ Armadillo package to load in R/RCpp for
complicated math operation
Many advance math function available:
- Similar package: Eigen,
RCppEigen
With Armadillo package, this enable us to include vector and matrices as variables via command “arma::vec” and “arma::mat”, respectively. For inline version, we have the following example:
library(Rcpp)
library(RcppArmadillo)
vecA<-c(1,2,3,4)
vecB<-c(4,5,6,7)
cppFunction(depends='RcppArmadillo', code =
'arma::mat MAT1(arma::vec veca, arma::vec vecb) {
arma::mat C;
C = veca*vecb.t();
return C;
}')
MAT1(vecA, vecB) #This C++ command generates 4 X 4 matrix from the multiplication of two vectors## [,1] [,2] [,3] [,4]
## [1,] 4 5 6 7
## [2,] 8 10 12 14
## [3,] 12 15 18 21
## [4,] 16 20 24 28
as.matrix(vecA)%*%t(as.matrix(vecB)) #This R command converts two vectors to matrices then multiplies## [,1] [,2] [,3] [,4]
## [1,] 4 5 6 7
## [2,] 8 10 12 14
## [3,] 12 15 18 21
## [4,] 16 20 24 28
| RCpp code |
|---|
Consider the following threshold autoregressive model: \[ \begin{split} y_{t}=\begin{cases} 1 -0.4 y_{t-1} +\varepsilon_{t},& \quad \text{if}\quad y_{t-1} \in (-\infty, -0.8],\\ 0.6 +y_{t-1}+\varepsilon_{t},& \quad \text{if}\quad y_{t-1} \in (-0.8, 0.5],\\ -1 -0.2y_{t-1} + \varepsilon_{t},& \quad \text{if}\quad y_{t-1} \in (0.5, \infty). \end{cases} \end{split} \\ \] with \(\varepsilon_{t}\sim N(0,1)\). The equation/system change as \(y_{t-1}\) breaches thresholds -0.8 and 0.5. We generate 1200 time points sequence starting \(t=1,\cdots,1200\).
We will utilizing two step threshold estimation approach, starting with group least angle regression (gLAR) algorithm and backward elimination algorithm (BEA).
We have two sets of code: - Fully R code - Hybrid of R and C++
For C++, RcppArmadillo is used because it involves a lot of importing and exporting vectors and matrices from R to C (and vice-versa).
Running R codes and hybrid codes for benchmarking. Here, we are obtaining 10 threshold candidates using gLAR algorithm, then filter the candidates using BEA. Left side is using full R code, Right side is using hybrid of R and C++ through RCpp